An Efficient Sequential Frequent Pattern Analysis Using DBCA
Abstract
This work addresses the practical problem of frequent-itemset discovery in data-stream environments that may suffer from data overload. The main issues are frequent-pattern mining and data-overload handling. Therefore, a mining algorithm is proposed together with separate, dedicated overload-handling mechanisms. The algorithm, DBCA (Dynamic Base Combinatorial Algorithm), dynamically extracts base information from streaming data and keeps it in its data structure. More specifically, the size of the base information it keeps on a data stream is related to the average transaction length n. With the overload-handling mechanisms, it can manage data overload effectively. Our results may lead to a feasible solution for sequential frequent-pattern mining in dynamic streams over a sliding window: overload is handled by pruning the excess incoming data and processing only the trimmed data, rather than the full volume of incoming data. Depending on how the overloading data are trimmed, various load-shedding policies are possible, and we describe three of them. Although the proposed policies have different properties, experiments verified all of them to be effective.
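The load-shedding idea in the abstract can be illustrated with a minimal Python sketch. It is not DBCA and does not reproduce any of the three policies from the paper, which the abstract does not specify. The class `SlidingWindowMiner`, the function `shed_load`, and the "keep the first `capacity` transactions" policy are all hypothetical names and choices made only for illustration.

```python
from collections import Counter, deque
from itertools import combinations


def shed_load(batch, capacity):
    """Hypothetical shedding policy: keep only the first `capacity` transactions."""
    return batch[:capacity]


class SlidingWindowMiner:
    """Illustrative itemset counter over a sliding window (assumption, not DBCA)."""

    def __init__(self, window_size, capacity, max_len=2):
        self.window = deque()           # transactions currently in the window
        self.window_size = window_size  # maximum number of transactions kept
        self.capacity = capacity        # transactions accepted per incoming batch
        self.max_len = max_len          # largest itemset size that is counted
        self.counts = Counter()

    def _itemsets(self, transaction):
        # Enumerate all itemsets of size 1..max_len in a transaction.
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            yield from combinations(items, k)

    def add_batch(self, batch):
        # Trim the batch first, then process only the trimmed data.
        for transaction in shed_load(batch, self.capacity):
            self.window.append(transaction)
            self.counts.update(self._itemsets(transaction))
            if len(self.window) > self.window_size:
                expired = self.window.popleft()
                self.counts.subtract(self._itemsets(expired))

    def frequent(self, min_support):
        # Itemsets whose count reaches min_support as a fraction of the window.
        threshold = min_support * len(self.window)
        return {s: c for s, c in self.counts.items() if c > 0 and c >= threshold}
```

For example, with `capacity=2` a batch of three transactions is trimmed to two before counting, so the third transaction never reaches the window. Other policies (for instance, trimming by priority or shortening long transactions) would only need a different `shed_load`.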
Similar resources
Mining Constraint-based Multidimensional Frequent Sequential Pattern in Web Logs
In this paper we introduce an efficient strategy for discovering Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Web usage mining consists of three phases, namely preprocessing, pattern discovery, and pattern analysis. This paper describes each of these phases in ...
Efficient Analysis of Pattern and Association Rule Mining Approaches
The process of data mining produces various patterns from a given data source. The most recognized data mining tasks are the process of discovering frequent itemsets, frequent sequential patterns, frequent sequential rules and frequent association rules. Numerous efficient algorithms have been proposed to do the above processes. Frequent pattern mining has been a focused topic in data mining re...
Approximate Frequent Pattern Mining
Frequent pattern mining has been a focused theme in data mining research and an important first step in the analysis of data arising in a broad range of applications. The traditional exact model for frequent pattern requires that every item occurs in each supporting transaction. However, real application data is usually subject to random noise or measurement error, which poses new challenges fo...
A Dilation-based Clustering Algorithm for Anti-Reflection Glass Inspection
This paper develops an efficient and effective dilation-based clustering algorithm (DBCA) for Anti-Reflection (AR) glass defect detection using run-length encoding (RLE). The fundamental concept of dilation-based connectivity and its limitation are described in the beginning. Subsequently, the architecture of DBCA is constructed in the following procedures: (1) run-length encoding, (2) RLE-base...
Using Answer Set Programming for pattern mining
Serial pattern mining consists in extracting the frequent sequential patterns from a unique sequence of itemsets. This paper explores the ability of a declarative language, such as Answer Set Programming (ASP), to solve this issue efficiently. We propose several ASP implementations of the frequent sequential pattern mining task: a non-incremental and an incremental resolution. The results show ...
Publication date: 2015